WITH A SHARED MEMORY COMMUNICATIONS AND CONTROL MEDIUM
An experimental distributed microprocessor subsystem is currently under
development at the Naval Air Development Center as a vehicle to investigate
distributed processing concepts with respect to replacing larger computers
with networks of microprocessors at the subsystem or node level. Major
benefits being exploited include increased performance, flexibility, system
availability, and survivability by use of multiple processing elements with
reduced cost, size, weight, and power consumption.

This paper concentrates on defining the distributed processing concept in
terms of control primitives, variables, and structures and their use in
performing a decomposed DFT (Discrete Fourier Transform) application function.
The DFT was chosen as an experimental application to investigate distributed
processing concepts because of its highly regular and decomposable structure
for concurrent execution. The design assumes interprocessor communications to
be anonymous. In this scheme, all processors can access an entire common
database by employing control primitives. Access to selected areas within the
common database is random, enforced by a hardware lock, and determined by task
and subtask pointers. This enables the number of processors to be varied in the
configuration without any modifications to the control structure. Decompositional
elements of the DFT application function in terms of tasks and subtasks are also
described.

The experimental hardware configuration consists of IMSAI 8080 chassis which 
are independent, 8-bit microcomputer units. These chassis are linked together 
to form a multiple processing system by means of a shared memory facility. This 
facility consists of hardware which provides a bus structure to enable up to six 
microcomputers to be interconnected. It provides polling and arbitration logic 
so that only one processor has access to shared memory at any one time. For
discussion purposes, five of the processors are designated as slaves and one as
a master; each slave contains an identical copy of a control executive and the
application program tasks. In actual operation, the slave processors cooperate
to compute the DFT while the master provides external input, output, and control
functions. With this implementation, commands to perform a DFT iteration are
provided through the master.

It is expected that this concept will be tested and demonstrated on a
laboratory model by the end of 1980. Evaluations will concentrate on areas such
as performance comparisons based on varying the number of processors and bus
contention factors as a function of local processing and common database
access times. Future work will focus on fault-tolerant techniques that can be
directly implemented and evaluated on the baseline laboratory model.
MOTIVATION

• AVIONIC PROCESSING SYSTEMS ARE BECOMING MORE DISTRIBUTED IN 
ORDER TO EXPLOIT THE FOLLOWING MAJOR BENEFITS: 

- INCREASED SYSTEM-WIDE REAL TIME PERFORMANCE

- EASE OF ADAPTABILITY TO INTEGRATION AND CHANGE 

- HIGH SYSTEM AVAILABILITY 

- DECREASED SYSTEM VULNERABILITY 

• BECAUSE OF REDUCED SIZE, WEIGHT, POWER CONSUMPTION AND COST 
ADVANTAGES, MICROPROCESSOR TECHNOLOGY WILL IMPACT AVIONIC 
PROCESSING SYSTEMS IN THE FOLLOWING AREAS: 

- INTERFACE AND HARDWIRED LOGIC REPLACEMENT APPLICATIONS 

- REPLACING LARGER COMPUTERS WITH NETWORKS OF SMALLER
COMPUTERS 


MICROPROCESSOR TECHNOLOGY AND 
DISTRIBUTED PROCESSING 


REASONABLE COST PERMITS EXPERIMENTING WITH CONCEPTS WHICH
WOULD OTHERWISE BE PAPER STUDIES 

REDUCED SIZE, POWER, AND WEIGHT PERMITS APPLICATIONS THAT WOULD 
OTHERWISE NOT BE FEASIBLE 

LIFE CYCLE COSTS OFTEN MUCH LOWER THAN FORMER SOLUTIONS TO SAME 
PROBLEM 



GLOBAL/LOCAL DISTRIBUTION



APPROACH 

• EXPERIMENTAL INVESTIGATION 

• LABORATORY MODEL 

• OFF-THE-SHELF HARDWARE (MICROPROCESSORS ARE INEXPENSIVE) 

- MULTIPLE PROCESSORS 

- SHARED MEMORY FACILITY INTERCONNECT 

• EXPERIMENTAL CONTROL STRUCTURE 

- LOCAL KNOWLEDGE OF EXISTENCE OF OTHER PROCESSORS NOT
REQUIRED 

- GLOBAL CONTROL AND TASK SCHEDULING VIA HIGHLY RELIABLE 
SHARED MEMORY 

• EXPERIMENTAL WELL-KNOWN APPLICATION: DFT

• DEMONSTRATE CONCEPT FEASIBILITY 

• PERFORM TRADE-OFF ANALYSES 

• IDENTIFY AND IMPLEMENT FAULT-TOLERANT CONCEPTS 




EXPERIMENTAL HARDWARE CONFIGURATION 
S = SLAVE PROCESSOR

M = MASTER PROCESSOR
LM = LOCAL MEMORY


SHARED MEMORY


SHARED MEMORY FACILITY CONSTRAINTS 

• SIX PROCESSORS MAXIMUM 

• ROUND-ROBIN POLLING SCHEME 

• ONE BYTE ACCESSED PER POLL 

• FIXED LOCK-OUT TIME IN FUG BLOCK 
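
The polling constraints listed above can be illustrated with a small simulation. This is a minimal sketch only, not the facility's actual arbitration logic: the function name `round_robin_grants` and the representation of pending requests as a list of counts are assumptions made for illustration, and the fixed lock-out time is not modeled.

```python
MAX_PROCESSORS = 6  # limit stated for the shared memory facility


def round_robin_grants(pending):
    """Return the order in which single-byte accesses are granted.

    `pending` is a hypothetical list of request counts: pending[i] is the
    number of one-byte accesses processor i is waiting to perform.
    """
    if len(pending) > MAX_PROCESSORS:
        raise ValueError("facility supports at most six processors")
    remaining = list(pending)
    grants = []
    position = 0
    while any(remaining):
        if remaining[position] > 0:
            remaining[position] -= 1   # one byte accessed per poll
            grants.append(position)
        # Move on to the next processor in fixed order, wrapping around.
        position = (position + 1) % len(remaining)
    return grants
```

Under this model a processor with several bytes to transfer must wait for a full polling cycle between consecutive bytes, which is one way bus contention could grow with the number of active processors.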


ASSUMPTIONS 


• MASTER PROCESSOR PERFORMS INTERFACE AND DISPLAY FUNCTIONS 

• SLAVE PROCESSORS PERFORM APPLICATION FUNCTION CONCURRENTLY 
AS DIRECTED BY MASTER PROCESSOR 

• LOCAL MEMORY 

- EACH SLAVE PROCESSOR CONTAINS IDENTICAL COPY OF PROGRAMS 

- CONTROL EXECUTIVE 

- APPLICATION TASKS 

• SHARED MEMORY 

- COMMON TO ALL PROCESSORS 

- CONTROL VARIABLES 

- APPLICATION DATA 

- ACCESSED BY CONTROL PRIMITIVES 

- ACCESS RIGHTS ENFORCED BY SEMAPHORES 

• VARYING NUMBER OF PROCESSORS DOES NOT AFFECT CONTROL STRUCTURE 
TASK STRUCTURE



SYSTEM CONTROL: SEMAPHORES 

• ENFORCES ACCESS RIGHTS TO SHARED MEMORY 

• USED TO INDICATE CONDITIONS 

- SHARED MEMORY BLOCKED 

- SHARED MEMORY AVAILABLE 

- ITERATION IN PROGRESS 


- ITERATION COMPLETED





CONTROL PRIMITIVES 




SEIZE 


RELEASE 


CONTROL VARIABLES 


BI TLP

- FORMAT:

  MSB                           LSB
  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   BIT POSITION (ONE BYTE)
  |  BI   |          TLP          |

- BI IS A 2-BIT SEMAPHORE AND INDICATES THE FOLLOWING CONDITIONS:

  SEMAPHORE
   B  I     CONDITION

   0  0     SHARED MEMORY BLOCKED AND ITERATION COMPLETED
   0  1     SHARED MEMORY BLOCKED AND ITERATION IN PROGRESS
   1  0     SHARED MEMORY AVAILABLE AND ITERATION COMPLETED
   1  1     SHARED MEMORY AVAILABLE AND ITERATION IN PROGRESS

- TLP IS A 6-BIT TASK LIST POINTER THAT CAN POINT TO ANY ONE OF 64 TASKS


TSs

- FORMAT:

  MSB                           LSB
  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   BIT POSITION (ONE BYTE)

- TSs IS AN 8-BIT WORD USED TO ASSOCIATE CORRESPONDING DATA WITH A TASK AND
  CAN TAKE ON 256 VALUES


CTC

- FORMAT:

  MSB                           LSB
  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   BIT POSITION (ONE BYTE)

- CTC IS AN 8-BIT CUMULATIVE TASK COUNTER. ONE CTC IS REQUIRED FOR EACH TYPE OF
  TASK BEING PERFORMED, I.E., THE NUMBER OF CTCs IS EQUAL TO THE NUMBER OF TASKS
  POINTED TO BY TLP.
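
The one-byte layout of the semaphore and task list pointer can be sketched as simple bit packing. This is an illustration only: the helper names `pack_bitlp` and `unpack_bitlp` are hypothetical, and placing the B bit at position 7 and the I bit at position 6 is an assumption read from the order in which the fields appear in the format, with TLP in the low six bits.

```python
# Assumed layout: bit 7 = B (1 = shared memory available),
# bit 6 = I (1 = iteration in progress), bits 5..0 = TLP.
CONDITIONS = {
    (0, 0): "SHARED MEMORY BLOCKED AND ITERATION COMPLETED",
    (0, 1): "SHARED MEMORY BLOCKED AND ITERATION IN PROGRESS",
    (1, 0): "SHARED MEMORY AVAILABLE AND ITERATION COMPLETED",
    (1, 1): "SHARED MEMORY AVAILABLE AND ITERATION IN PROGRESS",
}


def pack_bitlp(b, i, tlp):
    """Pack the two semaphore bits and the 6-bit task list pointer."""
    if b not in (0, 1) or i not in (0, 1):
        raise ValueError("B and I must each be 0 or 1")
    if not 0 <= tlp < 64:
        raise ValueError("TLP must address one of 64 tasks")
    return (b << 7) | (i << 6) | tlp


def unpack_bitlp(byte):
    """Split a control byte back into (B, I, TLP)."""
    return (byte >> 7) & 1, (byte >> 6) & 1, byte & 0x3F


def describe(byte):
    """Return the condition text for the semaphore bits of a control byte."""
    b, i, _ = unpack_bitlp(byte)
    return CONDITIONS[(b, i)]
```

For example, `pack_bitlp(1, 0, 5)` yields 133 (binary 10000101): shared memory available, iteration completed, task 5 selected.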










MASTER PROCESSOR CONTROL PRIMITIVES 


• SEIZEm PRIMITIVE 

THIS PRIMITIVE IS EXECUTED BY THE MASTER PROCESSOR WHEN ACCESSING SHARED MEMORY 



RELEASES SHARED MEMORY TO 
THE SLAVE PROCESSORS BY 
MEANS OF THE RELEASEm 
PRIMITIVE 


SLAVE PROCESSOR CONTROL PRIMITIVES

• SEIZEs PRIMITIVE

THIS PRIMITIVE IS EXECUTED BY THE SLAVE PROCESSORS WHEN ACCESSING SHARED MEMORY 
CONTINUE: OTHER PROCESSORS CAN ACCESS SHARED MEMORY BUT NO
VARIABLES CAN BE DISTURBED UNTIL ACCESSING PROCESSOR
RELEASES SHARED MEMORY BY MEANS OF THE RELEASEs
PRIMITIVE


• RELEASEs PRIMITIVE

THIS PRIMITIVE IS EXECUTED BY SLAVE 
PROCESSORS WHEN RELEASING SHARED MEMORY 
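
The flowcharts for the slave primitives did not survive in this text, so the following is only a sketch of how a seize/release pair of this kind could behave, not the original logic. Python threads stand in for slave processors, a `threading.Lock` stands in for the hardware lock, and the names `SharedMemory`, `seize`, `release`, and `run_slaves`, as well as the bit position of the availability flag, are assumptions made for illustration.

```python
import threading
import time

B_MASK = 0x80  # assumed position of the "shared memory available" bit


class SharedMemory:
    """Toy stand-in for the shared memory control area."""

    def __init__(self):
        self.control = B_MASK              # start in the "available" state
        self.counter = 0                   # stand-in for a shared control variable
        self._hw_lock = threading.Lock()   # stands in for the hardware lock

    def seize(self):
        """Wait until the semaphore shows 'available', then mark it 'blocked'."""
        while True:
            with self._hw_lock:            # test and set as one indivisible step
                if self.control & B_MASK:
                    self.control &= ~B_MASK & 0xFF
                    return
            time.sleep(0)                  # yield so the holder can release

    def release(self):
        """Mark the semaphore 'available' again."""
        with self._hw_lock:
            self.control |= B_MASK


def slave(memory, updates):
    for _ in range(updates):
        memory.seize()
        memory.counter += 1                # control variables touched only while seized
        memory.release()


def run_slaves(n_slaves=5, updates=40):
    """Run identical slaves against one shared memory; return the final counter."""
    memory = SharedMemory()
    threads = [threading.Thread(target=slave, args=(memory, updates))
               for _ in range(n_slaves)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return memory.counter
```

Because every slave runs the same code and coordinates only through the shared semaphore, changing `n_slaves` requires no change to the control logic, which mirrors the property claimed for the control structure.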






















DFT APPLICATION 


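A DFT CAN BE DEFINED IN THE FOLLOWING MATRIX FORM:

$$G = WF$$

IF WE LET:

• $n = 0, 1, \ldots, N-1$ = MATRIX ROW NUMBER AND FREQUENCY STEP

• $k = 0, 1, \ldots, K-1$ = MATRIX COLUMN NUMBER AND TIME STEP

• $N = K$, BUT MAINTAINING $n$ AND $k$ NOTATIONS TO DISTINGUISH ROWS FROM COLUMNS

THEN:

• W IS AN $N \times K$ MATRIX CONSISTING OF THE TERMS

$$W_{n,k} = e^{-j(2\pi/N)(nk \bmod N)} = \cos[(2\pi/N)(nk \bmod N)] - j \sin[(2\pi/N)(nk \bmod N)]$$

• F IS A $K \times 1$ MATRIX REPRESENTING THE FUNCTION $F(t_k)\,T/2\pi K$ OVER THE TIME SPAN T

• G IS AN $N \times 1$ MATRIX WHERE

$$G_n = \frac{T}{2\pi K} \sum_{k=0}^{K-1} W_{n,k}\, F(t_k)$$

IN EXPANDED FORM, G = WF CAN BE WRITTEN AS:

$$\begin{bmatrix} G_0 \\ G_1 \\ G_2 \\ \vdots \\ G_{N-1} \end{bmatrix} =
\begin{bmatrix}
W_{0,0} & W_{0,1} & W_{0,2} & \cdots & W_{0,K-1} \\
W_{1,0} & W_{1,1} & W_{1,2} & \cdots & W_{1,K-1} \\
W_{2,0} & W_{2,1} & W_{2,2} & \cdots & W_{2,K-1} \\
\vdots & \vdots & \vdots & & \vdots \\
W_{N-1,0} & W_{N-1,1} & W_{N-1,2} & \cdots & W_{N-1,K-1}
\end{bmatrix}
\begin{bmatrix} F(t_0)\,T/2\pi K \\ F(t_1)\,T/2\pi K \\ F(t_2)\,T/2\pi K \\ \vdots \\ F(t_{K-1})\,T/2\pi K \end{bmatrix}$$

SINCE:

$$G_{n,k}\ (\text{REAL}) = \cos[(2\pi/N)(nk \bmod N)]\, F(t_k)\, T/2\pi K$$

$$G_{n,k}\ (\text{IMAGINARY}) = -\sin[(2\pi/N)(nk \bmod N)]\, F(t_k)\, T/2\pi K$$

$\omega_n = n\,\Delta\omega$, WHERE $\Delta\omega$ IS THE SPACING BETWEEN FREQUENCY STEPS,

THE AMPLITUDE/FREQUENCY VALUES CAN BE OBTAINED AS FOLLOWS:

$$\mathrm{AMP}(\omega_n) = \Delta\omega\, |G_n|$$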
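
The matrix definition above can be checked numerically with a short sketch. It builds W from the terms exp(-j(2π/N)(nk mod N)), scales the samples by T/2πK, and forms the amplitude as Δω|G_n|. The function names are illustrative, and taking Δω = 2π/T is an assumption: the defining clause for Δω is not legible in this text, and 2π/T is simply the conventional frequency spacing for a record of length T.

```python
import cmath
import math


def dft_matrix(size):
    """Build W with terms exp(-j * (2*pi/N) * (n*k mod N)), where N = K = size."""
    return [[cmath.exp(-2j * math.pi * ((n * k) % size) / size)
             for k in range(size)]
            for n in range(size)]


def dft(samples, span):
    """Compute G = W F, with F scaled by T / (2*pi*K) as in the text."""
    size = len(samples)
    scale = span / (2 * math.pi * size)
    w = dft_matrix(size)
    return [scale * sum(w[n][k] * samples[k] for k in range(size))
            for n in range(size)]


def amplitudes(samples, span):
    """Return delta_omega * |G_n| for each frequency step n."""
    delta_omega = 2 * math.pi / span   # assumed spacing, see lead-in
    return [delta_omega * abs(g) for g in dft(samples, span)]
```

With this choice of Δω the factors of T cancel, so a single cosine cycle sampled at eight points gives an amplitude of 0.5 at n = 1 and at its mirror n = 7, and approximately zero elsewhere.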
AMPLITUDE


INPUT/OUTPUT



DFT DECOMPOSITION FOR TASK 1 


SUBTASK 1 OF N 



INPUT PROCESS OUTPUT 

(SHARED MEMORY) (LOCAL MEMORY p) (SHARED MEMORY) 
DFT DECOMPOSITION FOR TASK 2



N = K

[FIGURE: INPUT $F(t_{K-1})$; OUTPUT $G_{N-1}$ (IMAGINARY)]

SUBTASK K OF K 


DFT DECOMPOSITION FOR TASK 3 


SUBTASK 1 OF N 
INPUT: OUTPUT OF TASK 2
(SHARED MEMORY)
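
The decomposition figures are only partly legible here, so the sketch below does not reproduce the three tasks as defined. It illustrates the general idea under a stated assumption: each subtask is one row of G = WF, read from shared input, computed locally, and written back to shared output, with anonymous workers claiming the next unclaimed subtask through a shared pointer. The names `row_subtask` and `distributed_dft` are hypothetical, Python threads stand in for slave processors, and the T/2πK scaling is omitted for brevity.

```python
import cmath
import math
import threading


def row_subtask(n, samples):
    """Compute one unscaled output row: sum over k of W[n][k] * F(t_k)."""
    size = len(samples)
    return sum(cmath.exp(-2j * math.pi * ((n * k) % size) / size) * samples[k]
               for k in range(size))


def distributed_dft(samples, n_workers):
    """Let n_workers anonymous workers claim row subtasks from a shared pointer."""
    size = len(samples)
    shared = {"next_subtask": 0, "output": [None] * size}
    lock = threading.Lock()   # stands in for the shared memory access control

    def worker():
        while True:
            with lock:        # claim the next unclaimed subtask
                n = shared["next_subtask"]
                if n >= size:
                    return
                shared["next_subtask"] = n + 1
            value = row_subtask(n, samples)   # local processing, no lock held
            with lock:        # write the result back to shared memory
                shared["output"][n] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["output"]
```

Calling `distributed_dft(samples, 1)` and `distributed_dft(samples, 5)` returns the same list, since no worker needs to know how many others exist; only the elapsed time and the contention for the shared pointer would differ.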
STATUS


• IMPLEMENTATION 

• GCSS SIMULATION 

• LABORATORY EVALUATION 

• FAULT TOLERANT STUDIES 

- PROCESSOR 

- SHARED MEMORY 

- BUS 


RELIABILITY MODEL 


1 OF N 





• TAKE ADVANTAGE OF MULTIPLE 
PROCESSORS 

• OPTIMIZE EXISTING CONTROL 
STRUCTURE FOR FAULT- 
TOLERANCE PURPOSES 


CURRENTLY SINGLE POINT FAILURES 

STUDIES TO IDENTIFY FAULT TOLERANT 
SCHEMES 

POSSIBLE IMPLEMENTATION OF HIGHLY 
RELIABLE SHARED MEMORY WOULD 
BE DUPLEXED CONFIGURATION 
EACH WITH SINGLE ERROR CORRECTION 
AND DOUBLE ERROR DETECTION 
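
The reliability figure itself is not recoverable from this text, so the following is a hedged sketch of what a 1-of-N model commonly means rather than the model actually used. It assumes independent failures, treats the processor pool as working if at least one of N processors works, and places the shared memory and bus in series as single points of failure; duplexing the memory is modeled as a 1-of-2 arrangement. All function and parameter names are illustrative.

```python
def one_of_n(unit_reliability, n):
    """Probability that at least one of n independent identical units works."""
    return 1.0 - (1.0 - unit_reliability) ** n


def system_reliability(r_processor, n_processors, r_memory, r_bus,
                       duplex_memory=False):
    """Series combination of the processor pool, shared memory, and bus."""
    memory = one_of_n(r_memory, 2) if duplex_memory else r_memory
    return one_of_n(r_processor, n_processors) * memory * r_bus
```

Under these assumptions, adding processors quickly drives the pool term toward 1, after which the series memory and bus terms dominate, which is consistent with the emphasis on a highly reliable shared memory.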